Abstract. In this work we find a new formula for matrix averages over
the Gaussian ensemble. Let $H$ be an $n \times n$ Gaussian random matrix
with complex, independent, and identically distributed entries of zero
mean and unit variance. Given an $n \times n$ positive definite matrix $A$, and
a continuous function $f : \mathbb{R}^+ \to \mathbb{R}$ such that
$\int_0^\infty e^{-\alpha t} |f(t)|^2 \, dt < \infty$ for every $\alpha > 0$,
we find a new formula for the expectation $\mathbb{E}[\mathrm{Tr}(f(HAH^*))]$.
Taking $f(x) = \log(1 + x)$ gives another formula for the capacity of the
MIMO communication channel, and taking $f(x) = (1 + x)^{-1}$ gives the
MMSE achieved by a linear receiver.



Random Matrices, Limiting Distribution, Gaussian Averages, MIMO Capacity,
MMSE

1. Introduction 

Random matrix theory was introduced to the theoretical physics community
by Wigner in his work on nuclear physics in the 1950s ([23], [24]). Since
then, the subject has become an important and active research area in
mathematics, and it finds applications in fields as diverse as the Riemann
hypothesis, physics, chaotic systems, multivariate statistics, wireless
communications, signal processing, compressed sensing and information
theory. In the last decades, a considerable amount of work has emerged in
communications and information theory on the fundamental limits of
communication channels that makes use of results in random matrix theory [1].
For this reason, computing averages over certain matrix ensembles becomes
extremely important in many situations. To be more specific, consider the
well known case of the single user MIMO channel with multiple transmit
and receive antennas. Denoting the number of transmitting antennas by $t$
and the number of receiving antennas by $r$, the channel model is

$$ y = Hu + n $$

where $u \in \mathbb{C}^t$ is the transmitted vector, $y \in \mathbb{C}^r$ is
the received vector, $H$ is an $r \times t$ complex matrix and $n$ is a zero
mean complex Gaussian vector with independent, equal variance entries. We
assume that $\mathbb{E}(nn^*) = I_r$, where $(\cdot)^*$ denotes the complex
conjugate transpose. It is reasonable to put a power constraint

$$ \mathbb{E}(u^*u) = \mathbb{E}(\mathrm{Tr}(uu^*)) \le P $$

where $P$ is the total transmitted power. The signal to noise ratio, denoted
by snr, is defined as the quotient of the signal power and the noise power
and in this case is equal to $P/r$.

Recall that if $A$ is an $n \times n$ Hermitian matrix then there exist a
unitary matrix $U$ and $D = \mathrm{diag}(d_1, \ldots, d_n)$ such that
$A = UDU^*$. Given a continuous function $f$ we define $f(A)$ as

$$ f(A) = U \, \mathrm{diag}(f(d_1), \ldots, f(d_n)) \, U^*. $$

Naturally, the simplest example is the one where $H$ has independent and
identically distributed (i.i.d.) Gaussian entries, which constitutes the
canonical model for the single user narrow band MIMO channel. It is known
that the capacity of this channel is achieved when $u$ is a zero mean complex
Gaussian vector with covariance $\mathrm{snr} \, I_t$ (see for instance [20],
[19]). For the fast fading channel, assuming statistical channel state
information at the transmitter, the ergodic capacity is given by

$$ \mathbb{E}\big[\log\det(I_r + \mathrm{snr}\, HH^*)\big] = \mathbb{E}\big[\mathrm{Tr}\log(I_r + \mathrm{snr}\, HH^*)\big] $$

where in the last equality we use the fact that $\mathrm{Tr}\log(\cdot) = \log\det(\cdot)$.
We refer the reader to [20] or [19] for more details on this.



Another important performance measure is the minimum mean square error
(MMSE) achieved by a linear receiver, which determines the maximum
achievable output signal to interference and noise ratio (SINR). For an input
vector $x$ with i.i.d. entries of zero mean and unit variance the MSE at the
output of the MMSE receiver is given by

$$ \min_{M} \mathbb{E}\big[\|x - My\|^2\big] = \mathbb{E}\big[\mathrm{Tr}\big((I_t + \mathrm{snr}\, H^*H)^{-1}\big)\big] $$

where the expectation on the left hand side is over both the vectors $x$ and
the random matrices $H$, while the right hand side is over $H$ only. We refer
to [19] for more details on this.

There is a large literature and a long history of work on averages over
Gaussian ensembles. In [20] the capacity of the Gaussian channel was
computed as an improper integral. This integral is difficult to compute,
and asymptotic and simulation results are provided. In [3], [13] several
asymptotic results for large complex Gaussian random matrices are studied
in connection with wireless communication and information theory. In [13]
many aspects of correlated Gaussian matrices are addressed; in particular,
the capacity of the Rayleigh channel was computed as the number of antennas
increases to infinity. The books [19], [11], [1] are excellent introductions
to random matrix theory and its applications to physics and information
theory. In [10] the spectral eigenvalue distribution for a random infinite
d-regular graph was computed.

The typical approach in computing averages over random matrices is to
consider the asymptotic behavior as the size of the matrix increases to
infinity. In this work we contribute to this area by providing a unified
framework to express the ergodic mutual information, the MSE at the output
of the MMSE decoder and other types of functionals of a single user MIMO
channel, when the numbers of transmitting and receiving antennas are equal
and finite. We do not rely on asymptotic results as the number of antennas
increases. To the best of the author's knowledge, the results shown in this
work are new.



In Section 2 we present some preliminaries on Schur polynomials that are
later used in this work. In Section 3 we prove the main result of the paper,
Theorem 3.2. This Theorem provides a new formula for the expectation

$$ \mathbb{E}\big[\mathrm{Tr}\big(f(HAH^*)\big)\big] \tag{3} $$

where $A$ is a positive definite matrix and $f$ a continuous function such that

$$ \int_0^\infty e^{-\alpha t} |f(t)|^2 \, dt < \infty $$

for every $\alpha > 0$. Notice that, as previously stated, taking
$f(x) = \log(1 + x)$ gives another formula for the capacity of the MIMO
communication channel, and taking $f(x) = (1 + x)^{-1}$ gives the MMSE
achieved by a linear receiver. We also discuss some applications and present
some examples.



2. Schur Polynomials Preliminaries



A symmetric polynomial is a polynomial $P(x_1, x_2, \ldots, x_n)$ in $n$
variables such that if any of the variables are interchanged one obtains the
same polynomial. Formally, $P$ is a symmetric polynomial if for any
permutation $\sigma$ of the set $\{1, 2, \ldots, n\}$ one has

$$ P(x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(n)}) = P(x_1, x_2, \ldots, x_n). $$

Symmetric polynomials arise naturally in the study of the relation between
the roots of a polynomial in one variable and its coefficients, since the
coefficients can be given by symmetric polynomial expressions in the roots.
Symmetric polynomials also form an interesting structure by themselves.
The resulting structures, and in particular the ring of symmetric functions,
are of great importance in combinatorics and in representation theory (see
for instance [12], [8] for more details on this topic).
Figure 1. Young tableaux representation of the partition (5, 4, 1).



The Schur polynomials are certain symmetric polynomials in $n$ variables.
This class of polynomials is very important in representation theory since
they are the characters of irreducible representations of the general linear
groups. The Schur polynomials are indexed by partitions. A partition of
a positive integer $n$, also called an integer partition, is a way of writing
$n$ as a sum of positive integers. Two partitions that differ only in the order
of their summands are considered to be the same partition. Therefore, we
can always represent a partition $\lambda$ of a positive integer $n$ as a
sequence of $n$ non-increasing and non-negative integers $d_i$ such that

$$ \sum_{i=1}^{n} d_i = n \quad \text{with} \quad d_1 \ge d_2 \ge d_3 \ge \cdots \ge d_n \ge 0. $$



Notice that some of the $d_i$ could be zero. Integer partitions are usually
represented by the so-called Young diagrams (also known as Ferrers
diagrams). A Young diagram is a finite collection of boxes, or cells, arranged
in left-justified rows, with the row lengths weakly decreasing (each row has
the same or shorter length than its predecessor). Listing the number of boxes
on each row gives a partition $\lambda$ of a non-negative integer $n$, the total
number of boxes of the diagram. The Young diagram is said to be of shape
$\lambda$, and it carries the same information as that partition. For instance,
in Figure 1 we can see the Young diagram corresponding to the partition
(5, 4, 1) of the number 10.



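Given a partition $\lambda$ of $n$,

$$ n = d_1 + d_2 + \cdots + d_n, \qquad d_1 \ge d_2 \ge \cdots \ge d_n \ge 0, $$

the following functions are alternating polynomials (in other words they
change sign under any transposition of the variables):

$$ a_{(d_1+n-1,\, d_2+n-2,\, \ldots,\, d_n+0)}(x_1, \ldots, x_n) = \det \begin{pmatrix} x_1^{d_1+n-1} & x_2^{d_1+n-1} & \cdots & x_n^{d_1+n-1} \\ x_1^{d_2+n-2} & x_2^{d_2+n-2} & \cdots & x_n^{d_2+n-2} \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{d_n} & x_2^{d_n} & \cdots & x_n^{d_n} \end{pmatrix} = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \, x_{\sigma(1)}^{d_1+n-1} \cdots x_{\sigma(n)}^{d_n}, $$

where $S_n$ is the permutation group of the set $\{1, 2, \ldots, n\}$. Since they
are alternating, they are all divisible by the Vandermonde determinant

$$ \Delta(x_1, \ldots, x_n) = \prod_{1 \le j < k \le n} (x_j - x_k). $$

Figure 2. Young tableaux representation of the partition (5, 4, 1) with its
corresponding hook lengths.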

The Schur polynomial associated to $\lambda$ is defined as the ratio:

$$ s_\lambda(x_1, x_2, \ldots, x_n) = \frac{a_{(d_1+n-1,\, d_2+n-2,\, \ldots,\, d_n+0)}(x_1, \ldots, x_n)}{\Delta(x_1, \ldots, x_n)}. $$

This is a symmetric function because the numerator and denominator are
both alternating, and a polynomial since all alternating polynomials are
divisible by the Vandermonde determinant. For instance,

$$ s_{(2,1,1)}(x_1, x_2, x_3) = x_1 x_2 x_3 \, (x_1 + x_2 + x_3) $$

and

$$ s_{(2,2,0)}(x_1, x_2, x_3) = x_1^2 x_2^2 + x_1^2 x_3^2 + x_2^2 x_3^2 + x_1^2 x_2 x_3 + x_1 x_2^2 x_3 + x_1 x_2 x_3^2. $$
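The ratio-of-determinants definition can be checked numerically against the two examples above. The following is a minimal Python sketch, not part of the original development: the helper names `det` and `schur` are hypothetical, exact rational arithmetic is used, and the evaluation points are assumed to be pairwise distinct so that the Vandermonde determinant does not vanish.

```python
from fractions import Fraction
from itertools import permutations


def det(matrix):
    """Determinant by the Leibniz expansion (adequate for the small n used here)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # The sign of the permutation is read off from its inversion count.
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction((-1) ** inversions)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def schur(partition, xs):
    """Evaluate s_lambda(x_1, ..., x_n) as a ratio of two alternants."""
    n = len(xs)
    if len(partition) > n:
        raise ValueError("the partition has more parts than there are variables")
    lam = list(partition) + [0] * (n - len(partition))
    xs = [Fraction(x) for x in xs]
    # Row i (zero-based) carries the exponent d_{i+1} + n - (i + 1).
    numerator = [[x ** (lam[i] + n - 1 - i) for x in xs] for i in range(n)]
    vandermonde = [[x ** (n - 1 - i) for x in xs] for i in range(n)]
    return det(numerator) / det(vandermonde)


# At (x1, x2, x3) = (1, 2, 3): x1*x2*x3*(x1 + x2 + x3) = 6 * 6 = 36.
print(schur((2, 1, 1), (1, 2, 3)))  # 36
# The six monomials of s_(2,2,0) evaluate to 4 + 9 + 36 + 6 + 12 + 18 = 85.
print(schur((2, 2, 0), (1, 2, 3)))  # 85
```

The Leibniz expansion costs n! operations, which is acceptable only for the tiny matrices used for illustration; any exact elimination routine could be substituted.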



Another definition we need for the next Section is the so-called hook length,
hook(x), of a box x in a Young diagram of shape $\lambda$. This is defined as
the number of boxes that are in the same row to the right of it plus those
boxes in the same column below it, plus one (for the box itself). For instance,
in Figure 2 we can see the hook lengths of the partition (5, 4, 1). The product
of the hook lengths of a partition is the product of the hook lengths of all
the boxes in the partition.
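The definition of the hook length translates directly into a few lines of code. The sketch below is only illustrative; the function names `hook_lengths` and `hook_product` are hypothetical and the partition is assumed to be given as a weakly decreasing sequence of non-negative integers.

```python
def hook_lengths(partition):
    """Return the hook length of every box, row by row."""
    rows = [r for r in partition if r > 0]
    hooks = []
    for i, row_len in enumerate(rows):
        row = []
        for j in range(row_len):
            arm = row_len - j - 1                                # boxes to the right
            leg = sum(1 for lower in rows[i + 1:] if lower > j)  # boxes below
            row.append(arm + leg + 1)                            # plus the box itself
        hooks.append(row)
    return hooks


def hook_product(partition):
    """Product of the hook lengths of all the boxes of the partition."""
    product = 1
    for row in hook_lengths(partition):
        for h in row:
            product *= h
    return product


print(hook_lengths((5, 4, 1)))  # [[7, 5, 4, 3, 1], [5, 3, 2, 1], [1]]
print(hook_product((5, 4, 1)))  # 12600
```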

We recommend the interested reader to consult [H [U [H] for more details 
and examples on this topic. 
3. Averages over Gaussian Ensembles



Let $M_n$ be the set of all $n \times n$ complex matrices and $U_n$ the set of
$n \times n$ unitary complex matrices. Let $dH$ be the Lebesgue measure on
$M_n$ and let

$$ d\nu(H) = \pi^{-n^2} \exp\big(-\mathrm{trace}(H^*H)\big) \, dH $$

be the Gaussian measure on $M_n$. This is the measure induced by the Gaussian
random matrix with complex independent and identically distributed
entries with zero mean and unit variance in the set of matrices, when this
is represented as a Euclidean space of dimension $2n^2$. Note that this
probability measure is left and right invariant under unitary multiplication
(i.e., $d\nu(HU) = d\nu(UH) = d\nu(H)$ for every unitary $U$). The following
Theorem can be found on page 447 of [8].

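Theorem 3.1. [8] For every pair $A$, $B$ of Hermitian $n \times n$ matrices and
every partition $\lambda$

$$ \int_{M_n} s_\lambda(AH^*BH) \, d\nu(H) = h(\lambda) \, s_\lambda(A) \, s_\lambda(B) \tag{4} $$

where $h(\lambda)$ is the product of the hook-lengths of $\lambda$.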



Denote by $(m-k, 1^k)$ the partition $(m-k, 1, 1, \ldots, 1)$ with $k$ ones. It
is a well known fact in matrix theory (see [4] or [8]) that for every Hermitian
$n \times n$ matrix $A$ and for every positive integer $m$

$$ \mathrm{Tr}(A^m) = \sum_{k=0}^{n-1} (-1)^k \, s_{(m-k,\,1^k)}(A). \tag{5} $$

Note that for the case $1 \le m < n$, even though the sum is up to the $n-1$
term, all the terms between $\min\{n, m\}$ and $n-1$ are zero. In particular,

• $\mathrm{Tr}(A) = s_{(1)}(A)$

• $\mathrm{Tr}(A^2) = s_{(2,0)}(A) - s_{(1,1)}(A)$

• $\mathrm{Tr}(A^3) = s_{(3,0)}(A) - s_{(2,1)}(A) + s_{(1,1,1)}(A)$

• $\mathrm{Tr}(A^4) = s_{(4,0)}(A) - s_{(3,1)}(A) + s_{(2,1,1)}(A) - s_{(1,1,1,1)}(A)$.
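Identity (5) can be verified on a small example by evaluating the hook-shaped Schur polynomials at the eigenvalues of $A$. The sketch below is a self-contained illustration under stated assumptions: the names `det`, `schur` and `trace_power_via_hooks` are hypothetical, the eigenvalues are assumed pairwise distinct, and the sum is stopped at $\min\{n, m\} - 1$ because the remaining terms vanish.

```python
from fractions import Fraction
from itertools import permutations


def det(matrix):
    """Leibniz-expansion determinant, adequate for small matrices."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction((-1) ** inversions)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def schur(lam, xs):
    """s_lambda(xs) as a ratio of alternants; lam must have exactly len(xs) parts."""
    n = len(xs)
    num = [[Fraction(x) ** (lam[i] + n - 1 - i) for x in xs] for i in range(n)]
    van = [[Fraction(x) ** (n - 1 - i) for x in xs] for i in range(n)]
    return det(num) / det(van)


def trace_power_via_hooks(eigs, m):
    """Right-hand side of (5), evaluated at the eigenvalues of A."""
    n = len(eigs)
    total = Fraction(0)
    for k in range(min(n, m)):
        # The hook partition (m - k, 1^k), padded with zeros to n parts.
        lam = [m - k] + [1] * k + [0] * (n - k - 1)
        total += (-1) ** k * schur(lam, eigs)
    return total


eigs = (1, 2, 3)
for m in range(1, 5):
    # Both columns agree: 6, 14, 36, 98 for m = 1, 2, 3, 4.
    print(m, trace_power_via_hooks(eigs, m), sum(d ** m for d in eigs))
```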

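The constant $s_{(m-k,\,1^k)}(I_p)$ is equal to

$$ \frac{(m + p - (k+1))!}{m \, k! \, (p - (k+1))! \, (m - (k+1))!} $$

(see [8] for a proof of this formula). Therefore,

$$ \frac{s_{(m-k,\,1^k)}(I_p)}{s_{(m-k,\,1^k)}(I_n)} = \frac{(m + p - (k+1))!}{(m + n - (k+1))!} \cdot \frac{(n - (k+1))!}{(p - (k+1))!}. $$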
For every $\alpha > 0$ let us define the following class of functions

$$ L^2_\alpha := \Big\{ f : \mathbb{R}^+ \to \mathbb{R} \; : \; f \text{ measurable and such that } \int_0^\infty e^{-\alpha t} |f(t)|^2 \, dt < \infty \Big\}. $$

This is a Hilbert space with respect to the inner product

$$ \langle f, g \rangle_\alpha = \int_0^\infty e^{-\alpha t} f(t) g(t) \, dt. $$

Moreover, polynomials are dense with respect to this norm (see Chapter 10
in [13]). Let $\mathcal{A}_\alpha$ be the set of continuous functions in
$L^2_\alpha$ and let $\mathcal{A}$ be the intersection of all the
$\mathcal{A}_\alpha$,

$$ \mathcal{A} = \bigcap_{\alpha > 0} \mathcal{A}_\alpha. $$

Note that the family $\mathcal{A}$ is a very rich family of functions. For
instance, all continuous functions that do not grow faster than polynomials
belong to this family. In particular, $f(t) = \log(1 + t) \in \mathcal{A}$.



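Theorem 3.2. Let $A$ be an $n \times n$ positive definite matrix and let
$\{d_1, \ldots, d_n\}$ be the set of eigenvalues of $A$. Assume that all the
eigenvalues are different. Then for every $f \in \mathcal{A}$ we have that

$$ \int_{M_n} \mathrm{Tr}\big(f(H^*AH)\big) \, d\nu(H) = \frac{1}{\det(\Delta(D))} \sum_{k=0}^{n-1} \det(T_k) $$

where $\Delta(D)$ is the Vandermonde matrix associated with the matrix
$D = \mathrm{diag}(d_1, \ldots, d_n)$ and $T_k$ is the matrix constructed by
replacing the $(k+1)$ row of $\Delta(D)$, $\{d_i^{\,n-(k+1)}\}_{i=1}^{n}$, by

$$ \Big\{ \frac{1}{(n-(k+1))!} \, f_k(d_i) \Big\}_{i=1}^{n} $$

where

$$ f_k(x) := \int_0^\infty e^{-t} (tx)^{n-(k+1)} f(tx) \, dt. $$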





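Proof. First, we will prove the Theorem for polynomials. Let $p$ and $q$ be
two polynomials. It is clear that

$$ \mathrm{Tr}\big((p + q)(H^*AH)\big) = \mathrm{Tr}\big(p(H^*AH)\big) + \mathrm{Tr}\big(q(H^*AH)\big) $$

and $(p + q)_k = p_k + q_k$ for every $k = 0, \ldots, n-1$. Therefore, both sides
of the identity in Theorem 3.2 are linear in the function, and it is enough to
prove the Theorem for the case $p(x) = x^m$ with $m \ge 1$ (the constant case
$m = 0$ is immediate, since then $T_k = \Delta(D)$ for every $k$ and both sides
are equal to $n$). Using Theorem 3.1 and Equation (5) we see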



that for every positive definite $n \times n$ matrix $A$, the average

$$ \int_{M_n} \mathrm{Tr}\big((H^*AH)^m\big) \, d\nu(H) $$

is equal to

$$ \sum_{k=0}^{n-1} (-1)^k \, h(\lambda_k) \, s_{\lambda_k}(I_n) \, s_{\lambda_k}(A) $$

where $\lambda_k$ is the partition $(m-k, 1^k)$.

It is well known that for every partition $\lambda = (\lambda_1, \ldots, \lambda_n)$

$$ s_\lambda(I_n) = \prod_{1 \le i < j \le n} \frac{\lambda_i - \lambda_j + j - i}{j - i}. \tag{8} $$



Therefore, we can deduce that

$$ s_{\lambda_k}(I_n) = \frac{(m + n - (k+1))!}{m \, k! \, (n - (k+1))! \, (m - (k+1))!}. \tag{9} $$



We can see by direct examination that the product of the hook lengths of the
partition $\lambda_k$ is equal to

$$ h(\lambda_k) = k! \, (m - (k+1))! \, m. $$

Hence,

$$ s_{\lambda_k}(I_n) \, h(\lambda_k) = \frac{(m + n - (k+1))!}{(n - (k+1))!}. $$
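The three identities just used, namely (8) specialised to hook partitions, the closed form (9), and the hook product $h(\lambda_k)$, can be cross-checked mechanically for small parameters. The sketch below is only a sanity check; all function names are hypothetical and the ranges of $n$, $m$, $k$ are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import factorial


def schur_at_identity(lam):
    """Equation (8): s_lambda(I_n) for a partition lam with n parts."""
    n = len(lam)
    value = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            value *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return value


def hook_partition(m, k, n):
    """The partition (m - k, 1^k) padded with zeros to n parts."""
    return [m - k] + [1] * k + [0] * (n - k - 1)


def closed_form(m, k, n):
    """Equation (9)."""
    return Fraction(
        factorial(m + n - (k + 1)),
        m * factorial(k) * factorial(n - (k + 1)) * factorial(m - (k + 1)),
    )


def hook_product_of_hook(m, k):
    """h(lambda_k) = k! (m - (k+1))! m."""
    return factorial(k) * factorial(m - (k + 1)) * m


for n in range(1, 6):
    for m in range(1, 7):
        for k in range(min(n, m)):
            lhs = schur_at_identity(hook_partition(m, k, n))
            assert lhs == closed_form(m, k, n)
            product = lhs * hook_product_of_hook(m, k)
            assert product == Fraction(factorial(m + n - k - 1), factorial(n - k - 1))
print("equations (8), (9) and the hook product agree on the tested range")
```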



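Since $A$ is a positive definite matrix, by the spectral Theorem there exist
a unitary matrix $U$ and a diagonal matrix $D = \mathrm{diag}(d_1, \ldots, d_n)$
such that $A = UDU^*$. Note that the $d_i$ are the eigenvalues of $A$. By
definition of the Schur polynomials

$$ s_{\lambda_k}(A) = s_{\lambda_k}(D) = \frac{\det(S_k)}{\det(\Delta(D))} $$

where $\Delta(D)$ is the Vandermonde matrix associated with the sequence
$\{d_i\}_{i=1}^{n}$ and $S_k$ is a matrix whose $i$-th column is equal to

$$ \begin{pmatrix} d_i^{\,m+n-(k+1)} \\ d_i^{\,n-1} \\ \vdots \\ d_i^{\,n-(k+1)+1} \\ d_i^{\,n-(k+2)} \\ \vdots \\ d_i^{\,n-(n-1)} \\ 1 \end{pmatrix}. $$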
It is easy to see that after $k$ transpositions of the rows of the matrix $S_k$
we obtain a new matrix $H_k$ whose $i$-th column is equal to

$$ \begin{pmatrix} d_i^{\,n-1} \\ \vdots \\ d_i^{\,n-(k+1)+1} \\ d_i^{\,n+m-(k+1)} \\ d_i^{\,n-(k+2)} \\ \vdots \\ d_i^{\,n-(n-1)} \\ 1 \end{pmatrix}. $$

This matrix is equal to the Vandermonde matrix $\Delta(D)$ except for the
$(k+1)$ row, $\{d_i^{\,n-(k+1)}\}_{i=1}^{n}$, which is substituted by the row
$\{d_i^{\,n+m-(k+1)}\}_{i=1}^{n}$. Note also that

$$ \det(S_k) = (-1)^k \det(H_k). $$

Therefore, the average

$$ \int_{M_n} \mathrm{Tr}\big((H^*AH)^m\big) \, d\nu(H) $$

is equal to

$$ \frac{1}{\det(\Delta(D))} \sum_{k=0}^{n-1} \frac{(m + n - (k+1))!}{(n - (k+1))!} \, \det(H_k). $$



Using the fact that $\int_0^\infty e^{-t} t^p \, dt = p!$ and the definition of
$p_k(x)$ for the case $p(x) = x^m$ we see that

$$ p_k(x) := \int_0^\infty e^{-t} (tx)^{m+n-(k+1)} \, dt = (m + n - (k+1))! \; x^{m+n-(k+1)}. $$



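Therefore, our claim holds and we have proven the result for all polynomials.
Now consider $f \in \mathcal{A}$ and let $\beta$ be the maximum eigenvalue,
i.e., $\beta = \max\{d_1, \ldots, d_n\}$. Define $\alpha = 1/\beta$. Since
$f \in \mathcal{A}$, then $f \in \mathcal{A}_\alpha$; let $\{p^{(r)}\}_{r \ge 1}$
be a sequence of polynomials such that $\|f - p^{(r)}\|_\alpha \to 0$. Let
$T_k^{(r)}$ be the matrix constructed by replacing the $(k+1)$ row of
$\Delta(D)$, $\{d_i^{\,n-(k+1)}\}_{i=1}^{n}$, by

$$ \Big\{ \frac{1}{(n-(k+1))!} \, p_k^{(r)}(d_i) \Big\}_{i=1}^{n} $$

where

$$ p_k^{(r)}(x) := \int_0^\infty e^{-t} (tx)^{n-(k+1)} p^{(r)}(tx) \, dt. $$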
Let $T_k$ be the matrix constructed by replacing the $(k+1)$ row of
$\Delta(D)$ by

$$ \Big\{ \frac{1}{(n-(k+1))!} \, f_k(d_i) \Big\}_{i=1}^{n} $$

where

$$ f_k(x) := \int_0^\infty e^{-t} (tx)^{n-(k+1)} f(tx) \, dt. $$

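To prove that the identity in Theorem 3.2 holds it is enough to prove that

$$ \det(T_k^{(r)}) \to \det(T_k) $$

as $r \to \infty$ for every $k = 0, 1, \ldots, n-1$. For this, it is enough to
prove that $p_k^{(r)}(d_i) \to f_k(d_i)$ for every $k$ and every
$i = 1, 2, \ldots, n$. Note that

$$ |f_k(d_i) - p_k^{(r)}(d_i)| \le \int_0^\infty e^{-t} (t d_i)^{n-(k+1)} \, |f(t d_i) - p^{(r)}(t d_i)| \, dt $$

$$ \le d_i^{\,n-(k+1)} \sqrt{(2(n-(k+1)))!} \, \Big( \int_0^\infty e^{-t} \, |f(t d_i) - p^{(r)}(t d_i)|^2 \, dt \Big)^{1/2} $$

$$ = d_i^{\,n-(k+1)-1/2} \sqrt{(2(n-(k+1)))!} \, \Big( \int_0^\infty e^{-t/d_i} \, |f(t) - p^{(r)}(t)|^2 \, dt \Big)^{1/2} $$

where we use Cauchy-Schwarz for the second inequality and a change of
variable for the last step. Now, by construction the sequence $\{p^{(r)}\}$
satisfies

$$ \lim_{r \to \infty} \|f - p^{(r)}\|_\alpha^2 = \lim_{r \to \infty} \int_0^\infty e^{-\alpha t} \, |f(t) - p^{(r)}(t)|^2 \, dt = 0 $$

and $\alpha \le d_i^{-1}$, so that $e^{-t/d_i} \le e^{-\alpha t}$. Hence, we see
that

$$ \lim_{r \to \infty} |f_k(d_i) - p_k^{(r)}(d_i)| = 0 $$

finishing the proof. □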



Remark 3.3. We would like to observe that the case when not all the
eigenvalues are different can be treated as above by perturbing the original
eigenvalues and subsequently taking a limit. We present an instance of this
situation in Corollary 3.6.



As a consequence we have a new formula for the capacity of the MIMO 
communication channel and for the MMSE described in the introduction. 
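Corollary 3.4. Let $A$ be as in Theorem 3.2. Then

$$ \int_{M_n} \mathrm{Tr}\big(\log(I_n + H^*AH)\big) \, d\nu(H) $$

is equal to

$$ \frac{1}{\det(\Delta(D))} \sum_{k=0}^{n-1} \det(T_k) $$

where $T_k$ is the matrix constructed by replacing the $(k+1)$ row of
$\Delta(D)$, $\{d_i^{\,n-(k+1)}\}_{i=1}^{n}$, by

$$ \Big\{ \frac{1}{(n-(k+1))!} \int_0^\infty e^{-t} (t d_i)^{n-(k+1)} \log(1 + t d_i) \, dt \Big\}_{i=1}^{n}. $$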

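Corollary 3.5. Let $A$ be as in Theorem 3.2. Then

$$ \int_{M_n} \mathrm{Tr}\big((I_n + H^*AH)^{-1}\big) \, d\nu(H) $$

is equal to

$$ \frac{1}{\det(\Delta(D))} \sum_{k=0}^{n-1} \det(T_k) $$

where $T_k$ is the matrix constructed by replacing the $(k+1)$ row of
$\Delta(D)$, $\{d_i^{\,n-(k+1)}\}_{i=1}^{n}$, by

$$ \Big\{ \frac{1}{(n-(k+1))!} \int_0^\infty e^{-t} (t d_i)^{n-(k+1)} (1 + t d_i)^{-1} \, dt \Big\}_{i=1}^{n}. $$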



As an application let us compute explicitly the two dimensional case for the
capacity.

Corollary 3.6. Let $A$ be a positive definite $2 \times 2$ matrix with
eigenvalues $d_1$ and $d_2$. If $d_1 \ne d_2$ then

$$ \int_{M_2} \mathrm{Tr}\big(\log(I_2 + H^*AH)\big) \, d\nu(H) $$

is equal to

$$ \frac{f_0(d_1) - f_0(d_2) + d_1 f_1(d_2) - d_2 f_1(d_1)}{d_1 - d_2} $$

where $f_0(d_i) = \int_0^\infty e^{-t} \, t d_i \log(1 + t d_i) \, dt$ and
$f_1(d_i) = \int_0^\infty e^{-t} \log(1 + t d_i) \, dt$. If $d_1 = d_2 = d$ then

$$ \int_{M_2} \mathrm{Tr}\big(\log(I_2 + d \cdot H^*H)\big) \, d\nu(H) $$

is equal to

$$ \int_0^\infty e^{-t} \Big[ (1 + t) \log(1 + td) + \frac{td(t - 1)}{1 + td} \Big] \, dt. $$
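Both expressions of Corollary 3.6 are one-dimensional integrals and are easy to evaluate numerically. The sketch below is an illustration under stated assumptions, not a reference implementation: the integrals are truncated at an arbitrary upper limit and approximated with a composite Simpson rule, the function names are hypothetical, and the Monte Carlo routine uses the $2 \times 2$ identity $\det(I_2 + W) = 1 + \mathrm{Tr}(W) + \det(W)$ with $W = H^*AH$ and $A = \mathrm{diag}(d_1, d_2)$.

```python
import math
import random


def laplace_integral(g, upper=60.0, steps=6000):
    """Composite Simpson approximation of the integral of exp(-t) g(t) on [0, upper]."""
    h = upper / steps
    total = g(0.0) + math.exp(-upper) * g(upper)
    for i in range(1, steps):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-t) * g(t)
    return total * h / 3


def f0(d):
    return laplace_integral(lambda t: t * d * math.log1p(t * d))


def f1(d):
    return laplace_integral(lambda t: math.log1p(t * d))


def capacity_distinct(d1, d2):
    """First formula of Corollary 3.6 (requires d1 != d2)."""
    return (f0(d1) - f0(d2) + d1 * f1(d2) - d2 * f1(d1)) / (d1 - d2)


def capacity_equal(d):
    """Second formula of Corollary 3.6 (d1 = d2 = d)."""
    return laplace_integral(
        lambda t: (1 + t) * math.log1p(t * d) + t * d * (t - 1) / (1 + t * d)
    )


def capacity_monte_carlo(d1, d2, samples=20000, seed=0):
    """Sample mean of log det(I + H* diag(d1, d2) H) over 2 x 2 Gaussian H."""
    rng = random.Random(seed)
    s = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2
    total = 0.0
    for _ in range(samples):
        h = [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(2)] for _ in range(2)]
        trace_w = d1 * (abs(h[0][0]) ** 2 + abs(h[0][1]) ** 2) \
            + d2 * (abs(h[1][0]) ** 2 + abs(h[1][1]) ** 2)
        det_w = d1 * d2 * abs(h[0][0] * h[1][1] - h[0][1] * h[1][0]) ** 2
        total += math.log(1 + trace_w + det_w)
    return total / samples


print(capacity_equal(1.0))          # about 1.789 nats
print(capacity_distinct(1.0, 2.0))  # compare with the Monte Carlo estimate below
print(capacity_monte_carlo(1.0, 2.0))
```

The truncation point and step count are conservative for eigenvalues of moderate size; for very large eigenvalues a Gauss-Laguerre rule would likely be a better choice.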



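Proof. The case $d_1 \ne d_2$ is a direct application of Theorem 3.2 for
$n = 2$ and $f(x) = \log(1 + x)$. For the case $d_1 = d_2 = d$ both the
numerator and the denominator vanish, and we have to take the limit with
$d_1 = d + \epsilon$ and $d_2 = d$ as $\epsilon \to 0$. More precisely,

$$ \lim_{\epsilon \to 0} \frac{f_0(d + \epsilon) - f_0(d)}{\epsilon} = \int_0^\infty e^{-t} \Big( t \log(1 + td) + \frac{t^2 d}{1 + td} \Big) \, dt $$

and

$$ \lim_{\epsilon \to 0} \frac{(d + \epsilon) f_1(d) - d f_1(d + \epsilon)}{\epsilon} = \int_0^\infty e^{-t} \Big( \log(1 + td) - \frac{td}{1 + td} \Big) \, dt. $$

Putting all the pieces together we finish the proof. □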



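Analogously, we can compute explicitly the moments for the two dimensional
case.

Theorem 3.7. Let $A$ be a Hermitian $2 \times 2$ matrix with eigenvalues $d_1$
and $d_2$ and let $m \ge 1$. If $d_1 \ne d_2$ then

$$ \int_{M_2} \mathrm{Tr}\big((H^*AH)^m\big) \, d\nu(H) $$

is equal to

$$ m! \left( (m + 1) \, \frac{d_1^{m+1} - d_2^{m+1}}{d_1 - d_2} + \frac{d_1 d_2^{m} - d_2 d_1^{m}}{d_1 - d_2} \right). $$

If $d_1 = d_2 = d$ then

$$ \int_{M_2} \mathrm{Tr}\big((H^*AH)^m\big) \, d\nu(H) = m! \, (m^2 + m + 2) \, d^m. $$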
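The two closed forms of Theorem 3.7 can be tabulated in exact arithmetic, which also makes it easy to see that the equal-eigenvalue expression is the limit of the general one. The sketch below is illustrative only; the function name `moment_2x2` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def moment_2x2(d1, d2, m):
    """Closed form of Theorem 3.7 for the m-th moment in dimension two."""
    d1, d2 = Fraction(d1), Fraction(d2)
    if d1 == d2:
        return factorial(m) * (m * m + m + 2) * d1 ** m
    first = (m + 1) * (d1 ** (m + 1) - d2 ** (m + 1)) / (d1 - d2)
    second = (d1 * d2 ** m - d2 * d1 ** m) / (d1 - d2)
    return factorial(m) * (first + second)


print(moment_2x2(1, 2, 1))  # 6, i.e. 2 * Tr(A)
print(moment_2x2(1, 2, 2))  # 38
print(moment_2x2(1, 1, 3))  # 84

# Continuity at d1 = d2: a small perturbation barely changes the value.
nearby = moment_2x2(1, Fraction(1000001, 1000000), 3)
print(float(nearby - moment_2x2(1, 1, 3)))
```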



4. Conclusion 



Using results on random matrix theory and representation theory, in
particular Schur polynomials, we prove a new formula for the average of
functionals over the Gaussian ensemble. In particular, this gives another
formula for the capacity of the MIMO Gaussian channel and the MMSE achieved
by a linear receiver.